Session A-1

Privacy 1

May 11 Tue, 2:00 PM — 3:30 PM EDT

Privacy-Preserving Learning of Human Activity Predictors in Smart Environments

Sharare Zehtabian (University of Central Florida, USA); Siavash Khodadadeh (University of Central Florida, USA); Ladislau Bölöni and Damla Turgut (University of Central Florida, USA)

The daily activities performed by a disabled or elderly person can be monitored by a smart environment, and the acquired data can be used to learn a predictive model of user behavior. To speed up the learning, several researchers designed collaborative learning systems that use data from multiple users. However, disclosing the daily activities of an elderly or disabled user raises privacy concerns.

In this paper, we use state-of-the-art deep neural network-based techniques to learn predictive human activity models in the local, centralized, and federated learning settings. A novel aspect of our work is that we carefully track the temporal evolution of the data available to the learner and the data shared by the user. In contrast to previous work, where users shared all their data with the centralized learner, we consider users who aim to preserve their privacy. Thus, they choose among these approaches to achieve their goals of predictive accuracy while minimizing the amount of data they share. To help users make decisions before disclosing any data, we use machine learning to predict the degree to which a user would benefit from collaborative learning. We validate our approaches on real-world data.
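
As a rough illustration of the federated setting that the paper compares against local and centralized training, the sketch below shows one federated-averaging round over per-user activity-predictor weights. The model shapes, client count, and size-weighted averaging rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Average per-client model weights, weighted by local dataset size."""
    total = sum(client_sizes)
    averaged = [np.zeros_like(w) for w in client_weights[0]]
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            averaged[i] += (size / total) * w
    return averaged

# Toy example: three users' activity-predictor weights (hypothetical 2-layer shapes).
rng = np.random.default_rng(0)
clients = [[rng.normal(size=(8, 4)), rng.normal(size=4)] for _ in range(3)]
sizes = [120, 300, 75]                    # number of local samples each user holds
global_weights = federated_average(clients, sizes)
print([w.shape for w in global_weights])  # -> [(8, 4), (4,)]
```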

Privacy-Preserving Outlier Detection with High Efficiency over Distributed Datasets

Guanghong Lu, Chunhui Duan, Guohao Zhou and Xuan Ding (Tsinghua University, China); Yunhao Liu (Tsinghua University & The Hong Kong University of Science and Technology, China)

The ability to detect outliers is crucial in data mining, with widespread usage in many fields, including fraud detection, malicious behavior monitoring, and health diagnosis. With the tremendous volume of data becoming more distributed than ever, global outlier detection over a group of distributed datasets is particularly desirable. In this work, we propose PIF (Privacy-preserving Isolation Forest), which can detect outliers for multiple distributed data providers with high efficiency and accuracy while giving certain security guarantees. To achieve this goal, PIF makes an innovative improvement to the traditional iForest algorithm, enabling it to operate in distributed environments. With a series of carefully designed algorithms, the participating parties collaborate to build an ensemble of isolation trees efficiently without disclosing sensitive information in their data. Besides, to deal with complicated real-world scenarios where different kinds of partitioned data are involved, we propose a comprehensive scheme that works for both horizontally and vertically partitioned data models. We have implemented our method and evaluated it with extensive experiments. The results demonstrate that PIF achieves AUC comparable to the standard iForest on average and maintains linear time complexity without violating privacy.
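
For context, the sketch below runs the standard (non-private, centralized) isolation forest that PIF builds on, using scikit-learn on a synthetic dataset; it only illustrates the kind of anomaly scores involved. The distributed tree construction and the privacy-preserving protocol from the paper are not reproduced.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
inliers = rng.normal(0, 1, size=(500, 2))          # dense cluster of normal points
outliers = rng.uniform(-6, 6, size=(10, 2))        # scattered anomalies
data = np.vstack([inliers, outliers])

forest = IsolationForest(n_estimators=100, random_state=0).fit(data)
scores = forest.score_samples(data)                # lower score => more anomalous
print("most anomalous indices:", np.argsort(scores)[:10])
```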

CryptoEyes: Privacy Preserving Classification over Encrypted Images

Wenbo He, Shusheng Li and Wenbo Wang (McMaster University, Canada); Muheng Wei and Bohua Qiu (ZhenDui Industry Artificial Intelligence Co., Ltd., China)

Out of concern for privacy, a user usually encrypts images before they are uploaded to cloud service providers. Classification over encrypted images is essential for the service providers to collect coarse-grained statistical information about the images, and therefore to offer better services without sacrificing users' privacy. In this paper, we propose CryptoEyes to address the challenges of privacy-preserving classification over encrypted images. We present a two-stream convolutional network architecture for classification over encrypted images that captures the contour of encrypted images, thereby significantly boosting classification accuracy. By sharing a secret sequence between the service provider and the image owner, CryptoEyes allows the service provider to obtain category information about encrypted images while preventing unauthorized users from learning it. We implemented and evaluated CryptoEyes on popular datasets, and the experimental results demonstrate that CryptoEyes outperforms the existing state of the art in both classification accuracy over encrypted images and privacy preservation.
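
As an illustration of what a two-stream architecture looks like, the PyTorch skeleton below fuses a content branch and a contour branch before a shared classifier. Layer sizes, the contour-extraction step, and the fusion rule are assumptions; the abstract does not specify the actual CryptoEyes architecture.

```python
import torch
import torch.nn as nn

class TwoStreamClassifier(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.content_stream = branch()   # operates on the encrypted image
        self.contour_stream = branch()   # operates on a contour/edge map
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, encrypted_img, contour_map):
        features = torch.cat([self.content_stream(encrypted_img),
                              self.contour_stream(contour_map)], dim=1)
        return self.classifier(features)

# Toy usage with 28x28 single-channel inputs standing in for encrypted images.
model = TwoStreamClassifier()
logits = model(torch.randn(4, 1, 28, 28), torch.randn(4, 1, 28, 28))
print(logits.shape)   # -> torch.Size([4, 10])
```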

Privacy Budgeting for Growing Machine Learning Datasets

Weiting Li, Liyao Xiang, Zhou Zhou and Feng Peng (Shanghai Jiao Tong University, China)

The wide deployment of machine learning (ML) models and service APIs exposes sensitive training data to untrusted and unknown parties, such as end-users and corporations. It is important to preserve data privacy in released ML models. An essential issue with today's privacy-preserving ML platforms is a lack of attention to the tradeoff between data privacy and model utility: a private datablock can only be accessed a finite number of times, as each access leaks privacy. However, it has never been examined whether the privacy leaked during training actually brings good utility. We propose a differentially private access control mechanism on the ML platform to assign datablocks to queries. Each datablock arrives at the platform with a privacy budget, which is consumed at each query access. We aim to make the most use of the data under the privacy budget constraints. In practice, both datablocks and queries arrive continuously, so each access decision has to be made without knowledge of the future. Hence we propose online algorithms with a worst-case performance guarantee. Experiments on a variety of settings show that our privacy budgeting scheme yields high utility on ML platforms.
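
A minimal sketch of the per-datablock budgeting idea appears below: each block carries a privacy budget, every query access consumes part of it, and exhausted blocks stop serving queries. The greedy assignment rule is only a placeholder; the paper's online algorithms and their worst-case guarantees are not reproduced.

```python
class DataBlock:
    """A datablock that arrives with a finite privacy budget (epsilon)."""
    def __init__(self, name, epsilon_budget):
        self.name = name
        self.remaining = epsilon_budget

    def can_serve(self, epsilon_cost):
        return self.remaining >= epsilon_cost

    def consume(self, epsilon_cost):
        self.remaining -= epsilon_cost

def assign(blocks, epsilon_cost):
    """Greedy placeholder policy: pick the eligible block with the most budget left."""
    eligible = [b for b in blocks if b.can_serve(epsilon_cost)]
    if not eligible:
        return None                                # query rejected: budgets exhausted
    chosen = max(eligible, key=lambda b: b.remaining)
    chosen.consume(epsilon_cost)
    return chosen

blocks = [DataBlock("block-A", 1.0), DataBlock("block-B", 0.5)]
for cost in [0.3, 0.3, 0.3, 0.3, 0.3]:             # stream of query accesses
    block = assign(blocks, cost)
    print(cost, "->", block.name if block else "rejected")
```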

Session Chair

Athina Markopoulou (University of California, Irvine)

Session A-2

Privacy 2

May 11 Tue, 4:00 PM — 5:30 PM EDT

AdaPDP: Adaptive Personalized Differential Privacy

Ben Niu (Institute of Information Engineering, Chinese Academy of Sciences, China); Yahong Chen (Institute of Information Engineering, CAS & School of Cyber Security, UCAS, China); Boyang Wang (University of Cincinnati, USA); Zhibo Wang (Zhejiang University, China); Fenghua Li (Institute of Information Engineering, CAS & School of Cyber Security, UCAS, China); Jin Cao (Xidian University, China)

Users usually have different privacy demands when they contribute individual data to a dataset that is maintained and queried by others. To tackle this problem, several personalized differential privacy (PDP) mechanisms have been proposed to release statistical information about the entire dataset without revealing individual privacy. However, existing mechanisms produce query results with low accuracy, which leads to poor data utility. This is primarily because (1) some users are over-protected and (2) utility is not explicitly included in the design objective. Poor data utility impedes the adoption of PDP in real-world applications. In this paper, we present an adaptive personalized differential privacy framework, called AdaPDP. Specifically, to maximize data utility in different cases, AdaPDP adaptively selects underlying noise generation algorithms and calculates the corresponding parameters based on the type of query function, the data distribution, and the privacy settings. In addition, AdaPDP performs multiple rounds of utility-aware sampling to satisfy different privacy requirements for users. Our privacy analysis shows that the proposed framework provides a rigorous privacy guarantee. We conduct extensive experiments on synthetic and real-world datasets to demonstrate that the proposed framework incurs much lower utility loss across various query functions.
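
For intuition, the toy sketch below uses one known sampling-based construction for personalized differential privacy: sample each record with a probability derived from its personal budget, then answer a count query with Laplace noise at a uniform budget. It is meant only to show the kind of building block AdaPDP adapts; the utility-aware multi-round sampling and algorithm selection described in the abstract are not reproduced.

```python
import math
import random

def personalized_private_count(records, personal_eps, target_eps):
    """Sample record i with prob (e^eps_i - 1) / (e^target_eps - 1), capped at 1,
    then answer the count with Laplace(1/target_eps) noise (sensitivity 1)."""
    kept = [r for r, eps in zip(records, personal_eps)
            if random.random() < math.expm1(min(eps, target_eps)) / math.expm1(target_eps)]
    # Laplace noise as the difference of two exponential draws with rate target_eps.
    laplace_noise = random.expovariate(target_eps) - random.expovariate(target_eps)
    return len(kept) + laplace_noise

records = list(range(100))
personal_eps = [0.1 if i < 50 else 1.0 for i in records]   # heterogeneous privacy demands
print(personalized_private_count(records, personal_eps, target_eps=1.0))
```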

Beyond Value Perturbation: Local Differential Privacy in the Temporal Setting

Qingqing Ye (The Hong Kong Polytechnic University, Hong Kong); Haibo Hu (Hong Kong Polytechnic University, Hong Kong); Ninghui Li (Purdue University, USA); Xiaofeng Meng (Renmin University of China, China); Huadi Zheng and Haotian Yan (The Hong Kong Polytechnic University, Hong Kong)

Time series data have numerous application scenarios. However, since many time series are personal data, releasing them directly could cause privacy infringement. All existing techniques for publishing privacy-preserving time series perturb the values while retaining the original temporal order. However, in many value-critical scenarios such as health and financial time series, the values must not be perturbed, whereas the temporal order can be perturbed to protect privacy. As such, we propose "local differential privacy in the temporal setting" (TLDP) as the privacy notion for time series data. After quantifying the utility of a temporal perturbation mechanism in terms of the costs of a missing, repeated, empty, or delayed value, we propose three mechanisms for TLDP. Through both analytical and empirical studies, we show that the last one, the Threshold mechanism, is the most effective under most privacy budget settings, whereas the other two baseline mechanisms fill a niche by supporting very small or large privacy budgets.
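
The heavily hedged sketch below illustrates the general idea of perturbing temporal order rather than values: readings are released exactly, but from a small local buffer in a randomized order. It is a generic illustration only and is not the paper's Threshold mechanism or its TLDP analysis.

```python
import random

def perturb_order(stream, buffer_size=3, swap_prob=0.5):
    """Release exact values in a randomized order drawn from a small local buffer."""
    buffer, released = [], []
    for value in stream:
        buffer.append(value)
        if len(buffer) >= buffer_size:
            # With probability swap_prob release a random buffered value,
            # otherwise release the oldest one; the values themselves are untouched.
            idx = random.randrange(len(buffer)) if random.random() < swap_prob else 0
            released.append(buffer.pop(idx))
    released.extend(buffer)                  # flush whatever remains at the end
    return released

print(perturb_order([70, 72, 69, 150, 71, 73, 70]))   # e.g., exact heart-rate readings
```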

PROCESS: Privacy-Preserving On-Chain Certificate Status Service

Meng Jia (School of Cyber Science and Engineering, Wuhan University, China); Kun He, Jing Chen, Ruiying Du and Weihang Chen (Wuhan University, China); Zhihong Tian (Guangzhou University, China); Shouling Ji (Zhejiang University, China & Georgia Institute of Technology, USA)

Clients (e.g., browsers) and servers require public key certificates to establish secure connections. When a client accesses a server, it needs to check the signature, expiration time, and revocation status of the certificate to determine whether the server is reliable. Existing solutions for checking certificate status either have a long update cycle (e.g., CRL, CRLite) or violate clients' privacy (e.g., OCSP, CCSP), and they also suffer from trust concentration. In this paper, we present PROCESS, an online privacy-preserving on-chain certificate status service based on the blockchain architecture, which can ensure decentralized trust and provide privacy protection for clients. Specifically, we design a Counting Garbled Bloom Filter (CGBF) that supports efficient queries and a Block-Oriented Revocation List (BORL) to update the CGBF in a timely manner on the blockchain. With CGBF, we design a privacy-preserving protocol to protect clients' privacy when they check certificate statuses against the blockchain nodes. Finally, we conduct experiments and compare PROCESS with another blockchain-based solution to demonstrate that PROCESS is practical.
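
To make the data-structure side concrete, the sketch below implements a plain counting Bloom filter for revocation lookups, the structure that CGBF extends. The "garbled" construction and the privacy-preserving query protocol from the paper are not reproduced; hash choices and sizes are illustrative.

```python
import hashlib

class CountingBloomFilter:
    """Plain counting Bloom filter: supports insertion, deletion, and membership tests."""
    def __init__(self, size=1024, num_hashes=4):
        self.size, self.num_hashes = size, num_hashes
        self.counters = [0] * size

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):                     # e.g., a revoked certificate serial number
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):                  # counting (unlike plain) filters allow deletion
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def might_contain(self, item):           # false positives possible, false negatives not
        return all(self.counters[pos] > 0 for pos in self._positions(item))

cbf = CountingBloomFilter()
cbf.add("serial:0xDEADBEEF")
print(cbf.might_contain("serial:0xDEADBEEF"), cbf.might_contain("serial:0x1234"))
```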

Contact tracing app privacy: What data is shared by Europe's GAEN contact tracing apps

Douglas Leith and Stephen Farrell (Trinity College Dublin, Ireland)

We describe the data transmitted to backend servers by the contact tracing apps now deployed in Europe, with a view to evaluating user privacy. These apps consist of two separate components: a "client" app managed by the national public health authority, and the Google/Apple Exposure Notification (GAEN) service, which on Android devices is managed by Google and is part of Google Play Services. We find that the health authority client apps are generally well behaved from a privacy point of view, although the privacy of the Irish, Polish, Danish, and Latvian apps could be improved. In marked contrast, we find that the Google Play Services component of these apps is problematic from a privacy viewpoint. Even when minimally configured, Google Play Services still contacts Google servers roughly every 20 minutes, potentially allowing location tracking via IP address. In addition, the phone IMEI, hardware serial number, SIM serial number, IMSI, and handset phone number, among other identifiers, are shared with Google, together with detailed data on phone activity. This data collection is enabled simply by turning on Google Play Services, even when all other Google services and settings are disabled, and so is unavoidable for users of GAEN-based contact tracing apps on Android.

Session Chair

Tamer Nadeem (Virginia Commonwealth University)
